Predictors of the macroinvertebrate fauna in semiarid aquatic systems
Authors
Prof. Elvio S. F. Medeiros (Autor)
Rafaela L. de Farias (Autor)
Laryssa K. de Carvalho (Doutoranda)
Affiliation:
Laboratório de Ecologia, Universidade Estadual da Paraíba, Campus V, João Pessoa, PB
Published
February 10, 2026
Abstract
Assemblage structure of freshwater macroinvertebrates in semiarid Brazil: importance of habitat structureHabitat structure is more importand than sediment type as predictor of the macroinvertebrate fauna in semiarid aquatic systems
1 Introduction
2 Material and Methods
2.1 Study Area and Sampling Design
The present study was performed at the easter limits of the semiarid region of Brazil (Figure 1). This area is characterized by low average precipitation, concentrated in a few months of the year, usually between January and July (Figure 2), and high average annual temperatures between 20 and 32 °C (Barbosa et al. 2025). The main hydrological feature in the study area is the intermittence of surface water flow of its streams and rivers and, as a consequence, many man-made reservoirs built from the damming of the intermittent streams (Maltchik and Medeiros 2006). These rivers flow through the “Caatinga”, a deciduous arboreal to shrubby open forest (Silva et al. 2017). The climate in the study area is semiarid (BS´h hot and dry) and equatorial (As, dry summer) (Peel et al. 2007). A diverse array of sampling sites was chosen to represent the most common aquatic environment types of semiarid Brazil, with different sets of specificities and across different catchment basins. We sampled six sites, three of which consisted of stream reaches with surface water flow (during the rainy season) or isolated temporary pools (during the dry season) and three sites in artificial reservoirs created from stream impoundment. Sampling was conducted during the year of 2006 on four occasions during the rainy (April and June) and dry seasons (September and December).
2.2 Data Collection
Environmental characteristics of each site were measured in four sets of variables: (a) site morphology, (b) water quality, (c) sediment composition, and (d) marginal habitat structure. Site morphology was assessed by their width (cm) and depth (cm) measured from three random transects. Catchment scale variables (such as elevation and river length) were measured using handheld GPS and satellite imagery. Water quality was evaluated as physical and chemical variables that were measured using portable equipment for temperature (°C) and dissolved oxygen (mg/L). Transparency (cm) was measured using a Secchi disk and water velocity (m/s) was estimated using the float method (Maitland 1990). Sediment composition and the habitat physical structure followed protocols from Pusey Pusey et al. (2004) and Mugodo et al. (2006), adapted by Medeiros et al. (2008), where they are estimated from 9 to 12 one meter quadrants along the margins (terrestrial-aquatic interface) and determined by visual estimation of sediment type percentage cover (mud, sand, cobbles, small gravel, large gravel, rocks and bedrock) and of littoral and sub-aquatic structures (macrophytes, littoral grass, leaf litter, attached algae, filamentous algae, overhanging vegetation, submerged vegetation, small woody debris, large woody debris and root masses). At each study site, three samples of benthic macroinvertebrates were collected in the margins using a D type net (40 cm wide and 250 μm) to represent different habitat types. Samples were fixed in situ with 4% formalin and subsequently preserved in 70% ethanol. Macroinvertebrate specimens were sorted and identified to the lowest possible taxonomic level, usually family (McCafferty and Provonsha (1983), Mugnai et al. (2010), Thorp and Covich (2010) and Merritt et al. (2019), among others).
2.3 Statistical Analyses
Prior to statistical analysis, environmental variables were checked for multivariate collinearity and square root transformed to enhance normality and homogeneity of variances (Sokal and Rohlf 1995).
Code: Importing and organizing data
#BENTOS2006 - ORGANIZANDO DADOS----#####....----dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o console#shell.exec(getwd())getwd()setwd("D:/Elvio/OneDrive/MSS/_Bentos-2006/Bentos2006_Q")library(openxlsx)##CARREGANDO MATRIZES BRUTAS----densidade <-read.xlsx("D:/Elvio/OneDrive/MSS/_Bentos-2006/Bentos2006/zoobentos2006_com.xlsx",rowNames = T,colNames = T,sheet ="densidade")habitat <-read.xlsx("D:/Elvio/OneDrive/MSS/_Bentos-2006/Bentos2006/zoobentos2006_hab.xlsx",rowNames = T,colNames = T,sheet ="ambiente")grupos <-read.xlsx("D:/Elvio/OneDrive/MSS/_Bentos-2006/Bentos2006/zoobentos2006_hab.xlsx",rowNames = T,colNames = T,sheet ="grupos")densidade[1:5,1:5] #[1:5,1:5] mostra apenas as linhas e colunas de 1 a 5.###REMOVENDO LINHAS ZERADAS DE HABITAT----rownames(habitat)del_rows <-c("B-A-MU-1", "B-R-EP-1") #em Medeiros et al. 2024 BAMU1 foi mantidodel_rowsm_hab <- habitat[!(row.names(habitat) %in%c(del_rows)),]m_hab###PARTICIONANDO TABELA DE GRUPOS----t_grps <- grupos[!(row.names(grupos) %in% del_rows),]##COLUNAS ZERADAS DE DENSIDADE----sum <-colSums(densidade)sumzero_sum <-names(which(colSums(densidade) ==0))zero_sum #nomes das espécies zeradasm_part_cols <- densidade[(colSums(densidade) !=0)] #em != a exclamação inverte o sentidozero_sum2 <-names(which(colSums(m_part_cols) ==0))zero_sum2 #nomes das espécies zeradassum<-colSums(m_part_cols)sum###LINHAS ZERADAS DE DENSIDADE----m_part2 <-as.data.frame(t(m_part_cols))sum <-colSums(m_part2)sumzero_sum <-names(which(colSums(m_part2) ==0))zero_sum #nomes das espécies zeradasm_part_rows <- m_part2[(colSums(m_part2) !=0)] #em != a exclamação inverte o sentidozero_sum2 <-names(which(colSums(m_part_rows) ==0))zero_sum2 #nomes das espécies zeradassum<-colSums(m_part_rows)summ_dens <-as.data.frame((t(m_part_rows)))##SALVANDO MATRIZES FINAIS REMOVIDAS AS LINHAS/COLUNAS ZERADAS----write.table(m_dens, "m_dens.csv",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)write.table(t_grps, "t_grps.csv",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)write.table(m_hab, "m_hab.csv",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)t_grps <-read.csv("t_grps.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_dens <-read.csv("m_dens.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_hab <-read.csv("m_hab.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)#BENTOS2006 - MATRIZ DE CONTAGEM----#####....----library(openxlsx)contagem <-read.xlsx("D:/Elvio/OneDrive/MSS/_Bentos-2006/Bentos2006/zoobentos2006_com.xlsx",rowNames = T,colNames = T,sheet ="contagem")contagem[1:5,1:5] #[1:5,1:5] mostra apenas as linhas e colunas de 1 a 5.###COLUNAS ZERADAS DE CONTAGEM----sum <-colSums(contagem)sumzero_sum <-names(which(colSums(contagem) ==0))zero_sum #nomes das espécies zeradasm_part_cols <- contagem[(colSums(contagem) !=0)] #em != a exclamação inverte o sentidozero_sum2 <-names(which(colSums(m_part_cols) ==0))zero_sum2 #nomes das espécies zeradassum<-colSums(m_part_cols)sum###LINHAS ZERADAS DE CONTAGEM----m_part2 <-as.data.frame(t(m_part_cols))sum <-colSums(m_part2)sumzero_sum <-names(which(colSums(m_part2) ==0))zero_sum #nomes das espécies zeradasm_part_rows <- m_part2[(colSums(m_part2) !=0)] #em != a exclamação inverte o sentidozero_sum2 <-names(which(colSums(m_part_rows) ==0))zero_sum2 #nomes das espécies zeradassum<-colSums(m_part_rows)summ_cont <- (t(m_part_rows))m_contsum(m_cont)ncol(m_cont)##SALVANDO MATRIZ FINAL DE CONTAGEM----write.table(m_cont, "m_cont.csv",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)m_cont <-read.csv("m_cont.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_cont
Code: Correlations of environment data correlations
These variables were then subject to a Principal Component Analysis (PCA) to have their multivariate correlations described across sites. The PCA was conducted on the square root transformed and scaled data using the R package FactoMineR (Lê et al. 2008).
For the macroinvertebrate community, all analyses were performed on density of individuals (ind/m2) calculated by the ratio between the number of individuals and the sampled area of the D-net in each sample. Macroinvertebrate fauna was described by means of average density and rarefied taxa richness, where richness was standardized by the average sample size across all samples. Life stages (larvae, pupae, nymph and adults) were treated as separate taxa, considering their ecological differences (Ross et al. 1987). Comparisons of density and richness among study sites were tested by one-way ANOVA followed by post hoc multiple comparisons using Tukey’s HSD test (α=0.05) (Zar 1999).
A Stepwise Multiple Regression (SMR) analysis using the Akaike Information (MASS R package, stepAIC function, Venables and Ripley (2002)) as the selection criterion was performed in both directions (forward and backward) to identify the best predictors across subsets of variables (site morphology, water quality, sediment composition and marginal habitat structures) for species richness and density (Vendramin et al. 2020).
Variation in species composition among sites were ordinated using Non-metric Multidimensional Scaling (NMDS) (Lanés et al. 2018). Data were arcsine square-root transformed after relativization and the Bray-Curtis distance was calculated. Significance of groups was tested using the Multi-Response Permutation Procedure (MRPP) (Biondini et al. (1985), McCune and Grace (2002)). For all MRPP analyses, the A value is presented as a measure of homogeneity degree in a cluster, as compared to random expectation.
Indicator Species Analysis (ISA) was performed to evaluate species–site associations (indicspecies R package, De Cáceres and Legendre (2009)) using the Indicator, or index, Value (IndVal) (De Cáceres et al. 2010). The Indicator Value was calculated according to the method proposed by Dufrene and Legendre (1997). This value was then tested for statistical significance by means of permutations (999 repetitions). Further explorations of species ecological preferences was performed using Multilevel Pattern Analysis with the multipatt fuction (De Cáceres et al. 2010), based on correlation indices and the Pearson’s phi coefficient of association (Chytrý et al. (2002), De Cáceres and Legendre (2009)). This coefficient is a measure of the correlation between two binary vectors. Phi was based on the presence–absence community matrix (De Cáceres et al. 2010) and corrected for unequal group sizes (Tichy and Chytry 2006).
To summarize the contribution of site environmental variables to species composition we performed distance-based Redundancy Analysis (dbRDA; Legendre and Anderson (1999)), implemented by the vegan R package (Oksanen et al. 2020), between each group of environmental variables (explanatory predictors) (site morphology, water quality, sediment composition and marginal habitat structure) and the community composition density matrix (response matrix) to see which environmental/habitat variables were the most important for determining each benthic community composition (Lanés et al. 2018). Multicollinearity across explanatory (environmental) variables was assessed using the Variance Inflation Factor (VIF) function (vif.cca). Highly collinear variables (VIF > 10) were excluded from the final model. Model significance was evaluated using permutation-based Analysis of Variance (ANOVA) (999 permutations) (Oksanen et al. 2020). To identify a parsimonious model, forward and backward stepwise model selection based on permutation tests was applied, retaining only predictors that significantly explained additional variation in community structure. The final RDA model was re-fitted using the selected variables and re-evaluated for collinearity and significance. Both sequential (Type I) and marginal (Type III) permutation tests were conducted, with marginal effects used to infer the independent contribution of each predictor.
RDA was applied to the density community matrix after being transformed by the Hellinger transformation (Legendre and Gallagher 2001). Site morphology and water quality variables were square root transformed, whereas sediment composition and the marginal habitat structure (which were measured as percentages) were arcsine square-root transformed after relativization by column total (McCune and Grace 2002). Only taxa with the highest absolute scores on the first canonical axis were labeled in the final ordination diagram. All statistical analyses were performed in the statistical environment of R v.2.9.0 (Team 2017).
Redundancy Analysis data organization and transformation
#BENTOS2006 - RDA - ORGANIZANDO DADOS----#####....----##ORGANIZANDO DADOS----dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o consolet_grps <-read.csv("t_grps.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_dens <-read.csv("m_dens.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_hab_part <-read.csv("m_hab_part.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_trns <-read.csv("m_com_trns.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)library(vegan)library(tidyverse)##SELECIONANDO VARIÁVEIS DE INTERESSE----#Lista as colunascolnames(m_hab_part)# Escolher quais colunas usar por nome##colnames(m_hab_part)[rev(order(colSums(m_hab_part)))] #ordena por maior soma# Usar a função subset()#m_m <- subset(m_hab_part[, c("g.river_length", "g.altitude", "m.depth_hab", "m.depth_max", "m.slope", "m.width")])##m_part <- subset(m_hab_part[, 7:10])##m_part <- m_hab_part[, c("q.w_vel", "q.temp", "q.do", "q.turb")]m_h <- m_hab_part[, grep("^h", colnames(m_hab_part))]colnames(m_h) <-sub("^..", "", colnames(m_h))# Tranformando/relativizandom_q <-sqrt(m_q)m_h <-asin(sqrt(decostand(m_h, method="total", MARGIN =2)))write.table(m_h, "m_h.txt")
3 Results
3.1 Environmental variables
The study sites showed a wide array of environmental features, as expected from their different types (Figure 9). Water flow was present only in smaller streams in lower altitudes (sites 4-6) and during the rainy season (0.10-0.17 m/s). Stream sites tended to be narrower with widths ranging from 5.4 to 29.6 m, compared with 72.2 to 330 m for reservoirs. Littoral depths varied widely from 4.7 to 81.3 cm across all sites. Dissolved oxygen (DO) ranged from 1.8 to 9.4 mg/L and temperatures from 24.0 and 35.2 °C. Turbidity was low with Secchi depths reaching 90 cm, even though some sites had depths as low as 16 cm of the Secchi disk. Mud and sand were the main substrates across all sampled sites (with average covers of 54.4 and 37.9%, respectively). The habitat elements that gave greater overall contributions were attached and filamentous algae (12% on average), littoral grass (9.0%) and aquatic macrophytes (8.4%), but these contributions varied widely across sites and season (Figure 9) (detailed information on the study sites has been published elsewhere by Medeiros et al. (2008) and Medeiros et al. (2024).
Code: Environment data table
#TABELA DE HABITAT----library(dplyr)library(tidyr)m_trab <- m_hab_part %>%rename_with(~gsub("_", ".", .))m <- m_trab %>%group_by(Area = t_grps$area,Habitat = t_grps$ambiente,Ponto = t_grps$UA) %>%summarise(across(where(is.numeric),list(mean = mean, min = min, max = max)),.groups ='drop') %>%pivot_longer(cols =-c(Area, Habitat, Ponto),names_to =c("Variable", ".value"),names_sep ="_" )m <-as.data.frame(m)m_wide <- m %>%mutate(stat_string =ifelse(Variable ==c("q.w.vel"),paste0(round(mean, 3), " (", round(min, 3), "-", round(max, 3), ")"),paste0(round(mean, 1), " (", round(min, 1), "-", round(max, 1), ")"))) %>%unite("Location", Area, Habitat, Ponto, sep ="_") %>%select(Variable, Location, stat_string) %>%pivot_wider(names_from = Location, values_from = stat_string)m_widem_wide <-as.data.frame(m_wide)m_wide# Salvado m_widewrite.table(m_wide, "m_wide_hab.txt",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)m_wide_hab <-read.table("m_wide_hab.txt",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)# Exportando dados para Excellibrary(openxlsx)#write.xlsx(m_wide, file = "tabela de habitat.xlsx", rowNames = FALSE)wb <-loadWorkbook("tabela de habitat.xlsx")writeData(wb, sheet ="Sheet 1", x = m_wide)saveWorkbook(wb, "tabela de habitat.xlsx", overwrite =TRUE)
Code: Environment data variable summary
# Escolher sumário de uma variavelmvar <-"h.roots"m[m$Variable == var, "mean"] #cada valor de varsummary(m[m$Variable == var, "mean"]) #sumário dos valores de var# Escolher sumário de um grupo de variáveis do df mvars <-unique(grep("^h\\.", m$Variable, value =TRUE))summaries <-list() #criam uma lista vazia para guardar os sumários# Loop para cada variável do grupo e guarda em summariesfor (var in vars) { summaries[[var]] <-summary(m[m$Variable == var, "mean"])}#var is a temporary variable used in the for loop to iterate through#each variable name that starts with "h."summariessummary_table <-do.call(rbind, lapply(summaries, as.data.frame.list))round(summary_table, 2)#sink(file = "summary_h.txt", split = TRUE)round(summary_table[order(summary_table$Mean, decreasing =FALSE), ], 2)#sink()# Tabela limpa#summary_table <- cbind(Variable = rownames(summary_table), summary_table)#rownames(summary_table) <- NULL#colnames(summary_table) <- c("Variable", "Min", "Q1", "Median", "Mean", "Q3", "Max")#summary_table
Principal Component Analysis described the overall structure of the study sites and the most important features in separating them in terms of their physical and chemical variables, site morphometry, sediment composition, and marginal habitat structure (Figure 1). PCA explained 41.9% of the variance in the environmental variables, with the first axis (27.6%) showing a gradient from large reservoirs to smaller streams sites. Large scale morphology variables such as altitude, river length and site width were important to describe most reservoirs, whereas river sites were better described by local physical, chemical and habitat variables, such as water velocity, dissolved oxygen, temperature and overhanging vegetation. Larger reservoirs were mostly associated with a muddy substrate whereas river sites included a more diverse array of substrates including sand, rocks and gravel and cobbles. Other variables such as submerged vegetation and leaf litter (habitat structure), and slope (morphology) were important in explaining specific sites at given sampling occasions.
Environment data PCA - fviz
#PCA fviz package----#browseURL("https://www.sthda.com/english/articles/31-principal-component-methods-in-r-practical-guide/112-pca-principal-component-analysis-essentials/")#ORGANIZANDO DADOS----dev.off()rm(list=ls(all=TRUE))cat("\014")t_grps <-read.csv("t_grps.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_hab_part <-read.csv("m_hab_part.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)colnames(m_hab_part)#fix(m_hab_part)###RELATIVIZAÇÕES E TRANSFORMAÇÕES----m_hab_trns <-sqrt(m_hab_part)#m_hab_trns[m_hab_trns == -Inf] <- 0#m_hab_trns$g.altitude <- NULL #DELETA COLUNA# Salvado m_hab_trnswrite.table(m_hab_trns, "m_hab_trns.csv",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)m_hab_trns <-read.table("m_hab_trns.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)#PCA----library("FactoMineR")library("factoextra")pca <-PCA(m_hab_trns, scale.unit =TRUE, ncp =5, graph =TRUE)print(pca)eig.val <-get_eigenvalue(pca)eig.valsink(file ="cor_matrix.txt", append =FALSE)pca$var$corsink()fviz_eig(pca, addlabels =TRUE, ylim =c(0,30))var <-get_pca_var(pca)varfviz_pca_var(pca, col.var ="black")library("corrplot")corrplot(var$cos2, is.corr=FALSE)fviz_cos2(pca, choice ="var", axes =1:2)fviz_pca_var(pca, col.var ="cos2",gradient.cols =c("#00AFBB", "#E7B800", "#FC4E07"),repel =TRUE# Avoid text overlapping)fviz_pca_var(pca, alpha.var ="cos2")corrplot(var$contrib, is.corr=FALSE)fviz_contrib(pca, choice ="var", axes =1, top =10)fviz_contrib(pca, choice ="var", axes =2, top =10)fviz_contrib(pca, choice ="var", axes =1:2, top =10)fviz_pca_var(pca, col.var ="contrib",gradient.cols =c("#00AFBB", "#E7B800", "#FC4E07"))fviz_pca_var(pca, alpha.var ="contrib")##GROUPING BY KMEANS----set.seed(123)res.km <-kmeans(var$coord, centers =3, nstart =25)grp <-as.factor(res.km$cluster)# Color variables by groupsfviz_pca_var(pca, col.var = grp,palette =c("#0073C2FF", "#EFC000FF", "#868686FF"),legend.title ="Cluster")ind <-get_pca_ind(pca)indind$contrib##BIPLOTS----fviz_pca_ind(pca)fviz_pca_ind(pca, col.ind ="cos2",gradient.cols =c("#00AFBB", "#E7B800", "#FC4E07"),repel =TRUE#avoid text overlapping (slow if many points))fviz_pca_ind(pca, pointsize ="cos2",pointshape =21, fill ="#E7B800",repel =TRUE#avoid text overlapping (slow if many points))fviz_pca_ind(pca, col.ind ="cos2", pointsize ="cos2",gradient.cols =c("#00AFBB", "#E7B800", "#FC4E07"),repel =TRUE#avoid text overlapping (slow if many points))fviz_cos2(pca, choice ="ind")fviz_contrib(pca, choice ="ind", axes =1:2)# Create a random continuous variable of length 23,# Same length as the number of active individuals in the PCAnrow(m_hab_part)set.seed(123)my.cont.var <- t_grps$ponto.nmy.cont.var# Color individuals by the continuous variablefviz_pca_ind(pca, col.ind = my.cont.var,gradient.cols =c("blue", "yellow", "red"),legend.title ="Cont.Var",repel =FALSE)fviz_pca_ind(pca,geom.ind ="point", #show points only (nbut not "text")col.ind =as.factor(my.cont.var), #color by groupspalette ="grey",addEllipses =TRUE, #concentration ellipseslegend.title ="Groups")length(unique(my.cont.var))fviz_pca_ind(pca,geom.ind ="point",col.ind = t_grps$UA,palette ="grey",addEllipses =TRUE, ellipse.type ="confidence",mean.point =TRUE,legend.title ="Groups")fviz_pca_biplot(pca, repel =TRUE,col.var ="#2E9FDF", #variables colorcol.ind ="#696969"#individuals color)fviz_pca_biplot(pca,col.ind = t_grps$UA, palette ="jco",addEllipses =TRUE, elipse.type ="euclid",label ="var",col.var ="black", repel =TRUE,legend.title ="Species")#GRAFICO FINAL----fviz_pca_biplot(pca, repel =TRUE,col.var ="red", #variables colorcol.ind ="black"#individuals color)library(factoextra)library(ggplot2)# Convert t_grps$UA to a factor to ensure distinct shapest_grps$UA <-as.factor(t_grps$UA)# Generate the PCA biplot with individuals in blackfviz_pca_biplot(pca, repel =TRUE,col.var ="red", #cariables colorcol.ind ="black", #all individuals in blackgeom =c("point", "text"))# Convert t_grps$UA to a factor to ensure distinct shapest_grps$UA <-as.factor(t_grps$UA)# Generate the PCA biplot with individuals in blackfviz_pca_biplot(pca, repel =TRUE,col.var ="red", #variables colorcol.ind ="black", #all individuals in blackgeom =c("text")) +#remove default individualsgeom_point(aes(shape = t_grps$UA), size =3) +#map shapes to t_grps$UAscale_shape_manual(values =c(15, 16, 0, 1, 2, 17)) +#adjust shape valuestheme_minimal() #clean theme###FIX VARIABLES COORDINATES----var_coords <-as.data.frame(pca$var$coord)# Manually adjust positions (modify values as needed)#var_coords$Dim.1 <- var_coords$Dim.1 * 4.7 #shift right#var_coords$Dim.2 <- var_coords$Dim.2 * 4.7 #shift up#Fine tunning#fix(var_coords)#write.table(var_coords, "coords_var.txt")var_coords <-read.table("coords_var.txt")#rownames(var_coords) <-# substr(rownames(var_coords), 3, nchar(rownames(var_coords))) #remove os 2 primeiros caracteres###FIX INDIVIDUALS COORDINATES----ind_coords <-as.data.frame(pca$ind$coord)# Manually adjust positions (modify values as needed)#ind_coords$Dim.1 <- ind_coords$Dim.1 + 0 #shift right#ind_coords$Dim.2 <- ind_coords$Dim.2 + 0.2 #shift up#Fine tunning#fix(ind_coords)#write.table(ind_coords, "coords_ind.txt")ind_coords <-read.table("coords_ind.txt")#rownames(ind_coords) <-# substr(rownames(ind_coords), 5, nchar(rownames(ind_coords))) #remove os 4 primeiros caracteres#PLOT----png("fig-hab_PCA_fviz.png", width =11, height =9, units ="in", res =400)fviz_pca_biplot(pca, repel =FALSE,col.var ="red", col.ind ="black",geom ="none", #remove default labelslabel ="none") +#remove both variable & individual labels# Add manually adjusted variable labelsgeom_text(data = var_coords, aes(x = Dim.1, y = Dim.2, label =rownames(var_coords)),color ="red", size =4) +# Add manually adjusted individual labelsgeom_text(data = ind_coords, aes(x = Dim.1, y = Dim.2, label =rownames(ind_coords)),color ="black", size =4, nudge_x =0, nudge_y =0) +#corrected column namesgeom_point(aes(shape = t_grps$UA), size =3) +#map shapes to t_grps$UAscale_shape_manual(values =c(15, 16, 0, 1, 2, 17)) +theme_minimal() +ggtitle(NULL) +labs(shape ="Sites") +theme(legend.text =element_text(size =14), #increase legend text sizelegend.title =element_text(size =16)) +#increase legend title sizeguides(shape =guide_legend(override.aes =list(size =4))) #increase legend symbolsdev.off()
3.2 Benthic macroinvertebrates
A total amount of 28155 individuals was collected, divided into 32 taxonomic groups. Out of these, Thiaridae (1340.15 ind./m2 ± 3068.95), larvae of Chironomidae (629.07 ind./m2 ± 1424.72), and Oligochaeta (380.49 ind./m2 ± 856.62) were the most the representative (Figure 10).
Community structure data table
#BENTOS2006 - ESTR. DA COMUNIDADE----#####....----##ORGANIZANDO DADOS----dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o consolet_grps <-read.csv("t_grps.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_dens <-read.csv("m_dens.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_hab_part <-read.csv("m_hab_part.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_cont <-read.csv("m_cont.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)library(tibble)library(tidyverse)library(forcats)library(openxlsx)library(Rcpp)m_densstr(m_dens)m_trab <- m_densm_trab <- (m_trab) #SITESm_trab <-t(m_trab) #SPP##TAMANHO DA MATRIZ----range(m_trab) #menor e maior valoreslength(m_trab) #no. de colunasncol(m_trab) #no. de N colunasnrow(m_trab) #no. de M linhassum(lengths(m_trab)) #soma os nos. de colunaslength(as.matrix(m_trab)) #tamanho da matriz m x nsum(m_trab ==0) #número de observações igual a zerosum(m_trab >0) #número de observações maiores que zero#calculando a proporção de zeros na matrizzeros <- (sum(m_trab ==0)/length(as.matrix(m_trab)))*100zeros##ESTRUTURA DA COMUNIDADE----Sum <-rowSums(m_trab)#ouSum <-apply(m_trab,1,sum)Sum# Abundância relativa (%)RA <- (Sum /sum(Sum)) *100# percentage# MediaMean <-rowMeans(m_trab)Mean# OuMean <-apply(m_trab,1,mean)Mean# Desvio padrãoDP <-apply(m_trab,1,sd)DP# MáximoMax <-apply(m_trab,1,max)Max# MínimoMin <-apply(m_trab,1,min)Min# Mínimo não-zeroMinZ <-apply(m_trab, 1, function(row) { non_zero_values <- row[row >0] # Filter out zero valuesif (length(non_zero_values) ==0) {return(0) # If all values are zero, return 0 } else {return(min(non_zero_values)) # Return the minimum of non-zero values }})MinZm_pa <- m_trabm_pa[m_pa !=0] <-1rowSums(m_pa)library(vegan)bin <-decostand(m_trab,"pa")bin[1:10, 1:10]S <-apply(bin,1,sum)S#OURiqueza <-specnumber(m_trab)RiquezaRiqueza_total <-specnumber(colSums(m_trab))Riqueza_totalFO <-rowSums(m_trab >0) /ncol(m_trab) *100FOH <-diversity(m_trab, index ="shannon")HD <-diversity(m_trab, "simpson")DD[is.na(D)] <-0#substitui NA ou NaN por 0DE <- H/log(specnumber(m_trab))EE[is.na(E)] <-0#substitui NA ou NaN por 0E###DESCRITORES----Descritores1 <-cbind(Sum, RA, Mean, DP, Max, Min, MinZ, FO, S, E, H, D)Descritores1 <-as.data.frame(Descritores1)Descritores1Descritores1 <- Descritores1[order(-Descritores1$Mean), ] #ordena do maio para o menor "-"#Descritores1 <- Descritores1 %>% rownames_to_column(var="Espécies") #da nome a primeira colunaSomaTotalD <-apply(Descritores1,2,sum)SomaTotalDMediaTotalD <-apply(Descritores1,2,mean)MediaTotalDDPTotalD <-apply(Descritores1,2,sd)DPTotalDDescritores2 <-cbind(SomaTotalD, MediaTotalD, DPTotalD)Descritores2 <-as.data.frame(Descritores2)Descritores2 <-t(Descritores2)Descritores2DescritoresFinal <-rbind(Descritores1, Descritores2)DescritoresFinalDescritoresFinal <-round (DescritoresFinal, 2)DescritoresFinal# Salvando DescritoresFinalwrite.table(data.frame("Spp"=rownames(DescritoresFinal), DescritoresFinal),"DescritoresSPP.txt",row.names=FALSE,sep="\t")###TABELA FINAL DESCRITORES----library(gt)df <- DescritoresFinalncol(df); nrow(df) #no. de N colunas x M linhasdf <-cbind(Spp =rownames(df), df)gt <-gt(df, rowname_col ="Espécie",caption ="Descritores da diversidade por espécie (colunas). Sum, soma; RA, abundância relativa (%); mean, média; DP, desvio padrão da média; Max, maior valor; Min, menor valor; MinZ, menor valor não zero; FO, frequência de ocorrência (%); S, riqueza (ou no. de ocorrências, da matriz transposta); E, índice de equitabilidade de Pielou; H, índice de diversidade de Shannon; D, índice de diversidade de Simpson.")gtgtsave(gt, "DescritoresSPP.html")gtsave(gt, "DescritoresSPP.png")###NORMALIDADE----library(moments)Assimetria <-apply(m_trab,1,skewness)AssimetriaCurtose <-apply(m_trab,1,kurtosis)CurtoseNormalidade1 <-cbind(Assimetria, Curtose)Normalidade1 <-as.data.frame(Normalidade1)Normalidade1SomaTotalN <-apply(Normalidade1,2,sum)SomaTotalNMediaTotalN <-apply(Normalidade1,2,mean)MediaTotalNDPTotalN <-apply(Normalidade1,2,sd)DPTotalNNormalidade2<-cbind(SomaTotalN, MediaTotalN, DPTotalN)Normalidade2<-as.data.frame(Normalidade2)Normalidade2 <-t(Normalidade2) #"t" transpoe a matrizNormalidade2NormalidadeFinal <-rbind(Normalidade1, Normalidade2)NormalidadeFinalNormalidadeFinal <-round(NormalidadeFinal, 2)NormalidadeFinal# Salvando NormalidadeFinalwrite.table(data.frame("Spp"=rownames(NormalidadeFinal), NormalidadeFinal),"NormalidadeSPP.txt",row.names=FALSE,sep="\t")###TABELA FINAL NORMALIDADE----nf <- NormalidadeFinalncol(nf); nrow(nf) #no. de N colunas x M linhasnf <-cbind(Spp =rownames(nf), nf)gt <-gt(nf, rowname_col ="Site", caption ="Descritores da normalidade por espécie (coluna)")gtgtsave(gt, "NormalidadeSPP.html")#DIVERSIDADE EM DENSIDADE MATRIZ NÃO-TRANSPOSTA----# Números de HillHill <-renyi(m_trab,scales =c(0:5),hill =TRUE)Hilllibrary(ggplot2)library(tidyr)library(tidyverse)grafico1 <- Hill %>%rownames_to_column() %>%pivot_longer(-rowname) %>%mutate(name =factor(name, name[1:length(Hill)])) %>%ggplot(aes(x = name, y = value, group = rowname,col = rowname)) +geom_point(size =2) +geom_line(size =1) +xlab("Parâmetro de ordem de diversidade (q)") +ylab("Diversidade") +labs(col ="Locais") +theme_bw() +theme(text =element_text(size =10)) #ajustar a fonte caso nao caiba no output htmlgrafico1ggsave(grafico1, dpi =300, filename ="fig-hill.png")# Gráfico de abunânciaabund <-colSums(m_trab)#abund <- m_trab[1, ] #escolhe a primeira linha para a distribuição de abundânciadf <-data.frame(sp =colnames(m_trab),abun = abund)grafico2 <-ggplot(df, aes(fct_reorder(sp, -abun), abun, group =1)) +geom_col() +geom_line(col ="red", linetype ="dashed") +geom_point(col ="red") +xlab("Espécies") +ylab("Abundância") +theme_bw() +theme(axis.text.x =element_text(angle =45,hjust =1,face ="italic",size =10),axis.text.y =element_text(size =10),axis.title.x =element_text(size =12),axis.title.y =element_text(size =12))grafico2ggsave(grafico2, dpi =300, filename ="fig-abun.png")# Curvas de rarefaçãom_trab_r <-mutate(as.data.frame(m_trab), across(everything(), ceiling)) #arredonda a matriz original para números inteiroscolSums(t(m_trab_r))colnames(t(m_trab_r))#rarefa <- t(m_trab_r[c("S-R-CT2","S-R-CP2","S-A-TA2","B-A-MU2"),]) #curva de rarefação para os sítios especificadosclass(m_trab)# Curva de rarefação para todos os sítiosrarefa <-t(m_trab_r)## Curva de rarefação para as 8 UA's com maior soma#m_trab_r <- as.data.frame(t(m_trab_r)) #transpõe a matriz#col_sums <- colSums(m_trab_r)#largest_columns <- names(sort(col_sums, decreasing = TRUE)[1:8])#rarefa <- m_trab_r[largest_columns] #curva de rarefação para as 8 UA's com maior somalibrary(iNEXT)out <-iNEXT(rarefa, q =0,datatype ="abundance",size =NULL,endpoint =2000, #define o comprimento de eixo xknots =40,se =TRUE,conf =0.95,nboot =50)grafico3 <-ggiNEXT(out, type =1, facet.var="None") +theme_bw() +labs(fill ="Áreas") +xlab("Número de indivíduos") +ylab("Riqueza de espécies") +theme(legend.title=element_blank()) #ver como fica com facet.var="Assemblage"#grafico3#ggsave(grafico3, dpi = 300, filename = "fig-rare1.png")# Curvas de acumulação de espécesacumula <-specaccum(m_trab,method ="random")acumulaplot(acumula)plot_data <-data.frame("UAs"=c(0, acumula$sites),"Riqueza"=c(0, acumula$richness),"lower"=c(0, acumula$richness - acumula$sd),"upper"=c(0, acumula$richness + acumula$sd))gLocais <-ggplot(plot_data, aes(x = UAs, y = Riqueza)) +geom_point(color ="blue", size =4) +geom_line(color ="blue", lwd =2) +geom_ribbon(aes(ymin = lower, ymax = upper),linetype=2, alpha=0.3, fill ="yellow") +ylab("Riqueza acumulada") +theme_classic() +theme(text =element_text(size =16))gLocaisggsave(gLocais, dpi =300, filename ="fig-acum.png")plot_data <-data.frame("Individuos"=c(0, acumula$individuals),"Riqueza"=c(0, acumula$richness),"lower"=c(0, acumula$richness - acumula$sd),"upper"=c(0, acumula$richness + acumula$sd))gInd <-ggplot(plot_data, aes(x = Individuos, y = Riqueza)) +geom_point(color ="blue", size =4) +geom_line(color ="blue", lwd =2) +geom_ribbon(aes(ymin = lower, ymax = upper),linetype=2, alpha=0.3, fill ="yellow") +ylab("Riqueza acumulada") +theme_classic() +theme(text =element_text(size =16))gInd##TABELA DA COMUNIDADE----library(dplyr)library(tidyr)m_trab <- m_dens %>%rename_with(~gsub("_", ".", .))m <- m_trab %>%group_by(Area = t_grps$area,Habitat = t_grps$ambiente,Ponto = t_grps$UA) %>%summarise(across(where(is.numeric),list(mean = mean, sd = sd)),.groups ='drop') %>%pivot_longer(cols =-c(Area, Habitat, Ponto),names_to =c("Variable", ".value"),names_sep ="_" )mm <-as.data.frame(m)m_wide <- m %>%mutate(stat_string =ifelse( Variable =="nome_da_variavel", #escolhe uma variável pra ter 3 casas decimaispaste0(round(mean, 3), " (", round(sd, 3), ")"),paste0(round(mean, 1), " (", round(sd, 1), ")") )) %>%unite("Location", Area, Habitat, Ponto, sep ="_") %>% dplyr::select(Variable, Location, stat_string) %>%#dplyr dá conflito com MASSpivot_wider(names_from = Location, values_from = stat_string)m_widem_wide <-as.data.frame(m_wide)m_wide# Salvado m_widewrite.table(m_wide, "m_wide_spp.txt",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)m_wide_spp <-read.table("m_wide_spp.txt",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)# Exportando dados para Excellibrary(openxlsx)#write.xlsx(m_wide_spp, file = "tabela da comunidade.xlsx", rowNames = FALSE)wb <-loadWorkbook("tabela da comunidade.xlsx")writeData(wb, sheet ="Sheet 1", x = m_wide_spp)saveWorkbook(wb, "tabela da comunidade.xlsx", overwrite =TRUE)
Code: Community structure variable summary
# Escolher sumário de uma variavelmvar <-"h.roots"m[m$Variable == var, "mean"] #cada valor de varsummary(m[m$Variable == var, "mean"]) #sumário dos valores de var# Escolher sumário de um grupo de variáveis do df mvars <-unique(grep("^h\\.", m$Variable, value =TRUE))summaries <-list() #criam uma lista vazia para guardar os sumários# Loop para cada variável do grupo e guarda em summariesfor (var in vars) { summaries[[var]] <-summary(m[m$Variable == var, "mean"])}#var is a temporary variable used in the for loop to iterate through#each variable name that starts with "h."summariessummary_table <-do.call(rbind, lapply(summaries, as.data.frame.list))round(summary_table, 2)#sink(file = "summary_h.txt", split = TRUE)round(summary_table[order(summary_table$Mean, decreasing =FALSE), ], 2)#sink()# Tabela limpa#summary_table <- cbind(Variable = rownames(summary_table), summary_table)#rownames(summary_table) <- NULL#colnames(summary_table) <- c("Variable", "Min", "Q1", "Median", "Mean", "Q3", "Max")#summary_table
Sampling sites were different in macroinvertebrate rarefied richness (ANOVA d.f.= 5, 16; Frichness = 6.9; p=0.001) but not in density of individuals (ANOVA d.f.= 5, 16; Fdensity = 2.06; p = 0.124). According to post hoc Turkey tests, Cipó stream and the Recando reservoir had significantly higher rarefied richness when compared to the other study sites (p < 0.05) (Figure 2).
Community structure univariate comparisons
#BENTOS2006 - RAREFAÇÃO E ANOVAS----#####....----#ORGANIZANADO DADOS----dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o consolem_cont <-read.csv("m_cont.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_cont##RAREFAÇÃO BASEADA EM CONTAGEM----library(vegan)data <- m_contS <-specnumber(data) #observed number of speciesS#raremax <- min(rowSums(data))raremax <-round(mean(rowSums(data)))#raremax <- 1000raremaxSrare <-rarefy(data, raremax)Srareplot(S, Srare, xlab ="Observed No. of Species", ylab ="Rarefied No. of Species")abline(0, 1)png("fig-rarecurve.png")rarecurve(data, step =20, sample = raremax, col ="blue", cex =0.6,xlab ="Sample size", ylab ="Taxa", xlim =c()) #c(0,5000)dev.off()##ANOVAS----anovas <-cbind(Srare = Srare, m_cont)anovas <- anovas[,1, drop =FALSE]anovasm_dens <-read.csv("m_dens.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_densanovas$dens <-rowMeans(m_dens, na.rm =TRUE)anovasS <-specnumber(m_dens)anovas$S <- SanovasS###CRIANDO GRUPOS----library(tidyverse)anovas <-cbind(Grupos =rownames(anovas), anovas)anovasgrps <-substr(anovas[, 1], 5,6)grpsanovas <-mutate(anovas, Grupos =c(grps))anovas###SALVANDO MATRIZ FINAL ANOVAS----write.table(anovas, "t_anovas.csv",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)anovas <-read.csv("t_anovas.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)anovas##ANOVA----sink(file ="anovas.txt", split = T)anova <-aov(Srare ~ Grupos, data = anovas)anovasummary(anova)TukeyHSD(anova)###RESUMO DO TUKEY----tukey <-TukeyHSD(anova)tukey_df <-as.data.frame(tukey$Grupos)tukey_df$Comparison <-rownames(tukey_df)significant <-subset(tukey_df, `p adj`<0.05)print(significant)sink()##ANOVA PLOT----levels(anovas$Grupos) #to check existing levelsanovas$Grupos <-factor(anovas$Grupos,levels =c("MU", "SA", "EP", "RE", "CI", "SE")) #define a ordem#anovas %>%# group_by(Grupos) %>%# summarise(# count = n(),# mean = mean(Srare, na.rm = TRUE),# sd = sd(Srare, na.rm = TRUE))rareplot <-ggplot(anovas, aes(x = Grupos, y = Srare)) +geom_jitter(width =0.2, alpha =0.6, color ="black") +# jittered points (small)stat_summary(fun = mean, geom ="point", size =4, color ="black") +# large mean pointstat_summary(fun.data = mean_se, geom ="errorbar", width =0.1, color ="black") +# SE barslabs(x ="Study sites", y ="Rarefied richness", title ="") +theme_minimal()rareplotggsave(rareplot, dpi =300, filename ="fig-rarerich.png", bg ="white")
The Stepwise Multiple Regression Model (SMRM) was applied separately for each of the four sets of environmental variables (morphology, water quality, sediment composition and marginal habitat structures). Site morphology variables retained only stream width, with the final model explaining 15.4% of the variation (Adjusted R2 = 0.111; d.f. = 1,20; F = 3.63, p = 0.071). Site width β = -0.36 (± 0.19), indicates that wider sites tend to have fewer species, although this effect was marginally significant (p = 0.071). River length had a strong negative effect on species richness (β = -1.96 ± 0.34), explaining 63.1% of its variation (Adjusted R2 = 0.612; d.f. = 1, 20; F = 34.15, p < 0.001). The SMRM that best explained the water quality variables (42.3% of the variation in rarefied species richness) included water velocity and water temperature as significant predictors (Adjusted R2 = 0.288; d.f. = 4,17; F = 3.12, p = 0.043). Water velocity (β = -19.97 ± 9.04) had a significant negative effect on species richness (p = 0.041) and water temperature (β = 16.11 ± 4.97) had a significant positive effect (p = 0.005). Dissolved oxygen and turbidity were not significant predictors (p > 0.05). Among the substrate composition variables, gravel was the only significant predictor, explaining 32.8% of the variation in rarefied species richness (Adjusted R2 = 0.294; d.f. = 1,20; F = 9.76, p = 0.005, β = 1.85 ± 0.59). The SMRM for the habitat variables explained 58.6% of the variation in rarefied species richness (Adjusted R2 = 0.379; d.f. = 7,14; F = 2.83, p = 0.046), with leaf litter (β = 6.24 ± 2.33, p = 0.018) and algal cover (β = 0.92 ± 0.41, p = 0.039) having significant positive effects on species richness. Other predictors such as roots (β = 4.71, p = 0.058) and littoral grass (β = -1.01, p = 0.052) may have been important showing marginal effects. Stepwise Multiple Regression was not performed between the environmental variables and species densities because the latter did not vary significantly across sites and sampling occasions.
Community structure multiple regression models
#BENTOS2006 - REGRESSÕES----#####....----#CORRELAÇÃO--------dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o consoleanovas <-read.csv("t_anovas.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)ggplot(anovas, aes(x = dens, y = Srare)) +geom_point(size =3, color ="#00AFBB") +geom_smooth(method ="lm", color ="#FC4E07", se =TRUE) +labs(x ="dens",y ="Srare",title ="Correlação entre dens e Srare" ) +theme_minimal()cor.test(anovas$dens, anovas$Srare, method ="pearson")#STEPWISE REGRESSION----#browseURL("https://www.r-bloggers.com/2023/12/a-complete-guide-to-stepwise-regression-in-r/")#browseURL("https://www.sthda.com/english/articles/37-model-selection-essentials-in-r/154-stepwise-regression-essentials-in-r/")library(tidyverse)library(caret)library(leaps)library(MASS)##ORGANIZANDO OS DADOS----m_hab_part <-read.csv("m_hab_part.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)# Select only columns starting with "m."colnames(m_hab_part)vars <-grep("^(m)\\.", colnames(m_hab_part), value =TRUE) #^(g|m) mais de uma inicialvarsm_vars <-sqrt((m_hab_part[, vars])) #TRANSFORMAÇÃOm_varsdata <-cbind(Srare = anovas$Srare, m_vars)data# Fit the full modelfull.model <-lm(Srare ~., data = data)# Stepwise regression modelstep.model <-stepAIC(full.model, direction ="both", #both, backward, forwardtrace =FALSE)summary(step.model)models <-regsubsets(Srare ~., data = data, nvmax =5, #maximal number of predictors to incorporate in the modelmethod ="seqrep") #sequential replacement, combination of forward and backward selectionssummary(models)# Train modelset.seed(123)# set up repeated k-fold (number=10) cross-validation ("cv")train.control <-trainControl(method ="cv", number =10)# Train the modelstep.model <-train(Srare ~., data = data,method ="leapBackward",tuneGrid =data.frame(nvmax =1:5),trControl = train.control)#**You can't directly use two different data frames in train(), because train()#(from caret package) expects all predictor (X) and response (Y) variables#to be in the same data frame.step.model$resultsstep.model$bestTunesummary(step.model$finalModel)nvmax <- step.model$bestTune$nvmaxnvmaxcoef(step.model$finalModel, nvmax)lm_model <-lm(Srare ~ m.slope + m.width,data = data)summary(lm_model)colnames(m_vars)###ALGUNS GRÁFICOS----par(mfrow =c(2,2))plot(step.model)varImp(step.model)#library(car)#crPlots(step.model)#par(mfrow = c(1,1))
Ordination analysis revealed a wide variation in community composition across sampling occasions with some degree of similarity within sites. The NMDS resulted in a non-metric fit (R2) of 0.97, with a stress of 0.173 (Figure 3).
Community data Classification and Ordination analysis
#BENTOS2006 - NMDS----#####....----##ORGANIZANDO DADOS----dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o consolet_grps <-read.csv("t_grps.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_dens <-read.csv("m_dens.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_hab_part <-read.csv("m_hab_part.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_cont <-read.csv("m_cont.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_densstr(m_dens)m_trab <- m_dens#CLASSIFICAÇÃO: Matriz Comunitária----library(vegan)m_trns <-asin(sqrt(decostand(m_trab,method="total", MARGIN =2)))#m_trns <- sqrt(m_trab)# Salvando a matriz transformadawrite.table(m_trns, "m_com_trns.csv",sep =";", dec =".", #"\t",row.names =TRUE,quote =TRUE,append =FALSE)m_dist <-vegdist(m_trns, method ="bray",diag =TRUE,upper =FALSE)save(m_dist, file ="m_dist.RData")load("m_dist.RData")cluster_uas <-hclust(m_dist, method ="average")png("fig-cluster.png", width =9, height =7, units ="in", res =400)plot(cluster_uas, main ="Cluster Dendrogram - Bray-Curtis da Matriz Comunitária",hang =0.1) #testar com -.01#rect.hclust(cluster_uas, k = 3, h = NULL)#h = 0.8 fornece os grupos formados na altura hdev.off()as.matrix(m_dist)[1:6, 1:6]# Heatmaplibrary("gplots")heatdist <-as.matrix(m_dist)col <-rev(heat.colors(999)) #rev() reverte as cores do heatmappng("fig-heatmap_uas.png", width =9, height =7, units ="in", res =400)heatmap.2(x=(as.matrix(m_dist)), #objetos x objetosRowv =as.dendrogram(cluster_uas),Colv =as.dendrogram(cluster_uas),key = T, tracecol =NA, revC = T,col = heat.colors, #dissimilaridade = 1 - similaridadedensity.info ="none",xlab ="UA´s", ylab ="UA´s",mar =c(6, 6) +0.2)dev.off()cluster_spp <-hclust((m_dist(t(m_trns), method ="bray",diag =TRUE,upper =FALSE)), method ="average")png("fig-cluster_spp.png", width =9, height =7, units ="in", res =400)plot (cluster_spp, main ="Dendrograma dos atributos")dev.off()png("fig-heatmap.png", width =9, height =9, units ="in", res =400)heatmap.2(t(as.matrix(m_trns)), #objetos x atributosColv =as.dendrogram(cluster_uas),Rowv =as.dendrogram(cluster_spp),key = T, tracecol =NA, revC = T,col = col,density.info ="none",xlab ="Unidades amostrais", ylab ="Espécies",mar =c(6, 6) +1.8) # adjust margin sizedev.off()# Histórico das fusõeslibrary(tidyverse)library(gt)merge <-as.data.frame(cluster_uas$merge)merge[nrow(merge)+1,] =c("0","0")height <-as.data.frame(round(cluster_uas$height, 2))height[nrow(height)+1,] =c("1.0")fusoes <-data.frame(Cluster_uas = merge, Height = height)colnames(fusoes) <-c("Cluster1", "Cluster2", "Height")UA <-rownames_to_column(as.data.frame(m_trns[, 0]))colnames(UA) <-c("UAs")No.UA <-1:nrow(fusoes)fusoes <-cbind(No.UA, UA, fusoes)fusoes$Histórico <-1:nrow(fusoes)#fusoesfusoes <-gt(fusoes)fusoesgtsave(fusoes, "Fusoes.html")#ORDENAÇÃO----nmds <-metaMDS(m_trns, #matriz CPE, definida anteriormentek=2) #no. de redução de dimensõesnmdsset.seed(321)nmds <-metaMDS(m_trns, distance ="bray", k =2, trymax =100)#method = "manhattan", "euclidean", "canberra", "clark", "bray", "kulczynski", "jaccard", "gower", #"altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao", "cao", "mahalanobis", "chisq", #"chord", "hellinger", "aitchison", or "robust.aitchison"nmdsscores(nmds)stressplot(nmds)gof <-goodness(nmds)gof##GRAFICOS DE ORDENAÇÃO----plot(nmds, display ="sites", type ="n")points(nmds, display ="sites", cex =2*gof/mean(gof))plot(nmds)ordiplot(nmds, choices =c(1,2), type ="n")orditorp(nmds, display ="species", col="red", air=0.01)orditorp(nmds, display ="sites", cex=1.25, air=0.01)plot(nmds)colnames(t_grps)grupos <- t_grps$UAgrupos###GRÁFICO FINAL----colors <- dplyr::recode(grupos, #conflito com outro pacote(?)"CI"="grey","SA"="black","MU"="black","EP"="grey","RE"="black","SE"="grey",.default ="black")colorspng("fig-NMDS.png", width =9, height =7, units ="in", res =400)ordiplot(nmds, type ="n")ordihull(nmds, groups = grupos, draw ="polygon",col="grey90", label =TRUE)#orditorp(nmds, display = "species", col = "red", air = 0.01)#orditorp(nmds, display = "sites", col = colors, air = 0.01, cex = 1.25)#points(nmds, display = "sites", pch = 21, col = "black", bg = colors, cex = 1.25)#sites_coords <- scores(nmds, display = "sites")sites_coords <-read.table("nms_sites_coords.txt")#text(sites_coords[,1], sites_coords[,2],# labels = rownames(sites_coords),# col = colors, cex = 1, pos = 3) # pos = 3 puts labels above#fix(sites_coords)#write.table(sites_coords, "nms_sites_coords.txt")dev.off()# Mais gráficos----par(mfrow=c(2,2))ordiplot(nmds, type ="n")orditorp(nmds, display ="species", col ="red", air =0.01)orditorp(nmds, display ="sites", col = colors, air =0.01, cex =1.25)ordihull(nmds, groups = grupos, draw ="polygon", col ="grey90", label =TRUE)ordiplot(nmds, type ="n")orditorp(nmds, display ="species", col ="red", air =0.01)orditorp(nmds, display ="sites", col = colors, air =0.01, cex =1.25)ordiellipse(nmds, groups = grupos, display ="sites", draw ="polygon", col ="grey90", label = T)ordibar(nmds, grupos, display ="sites")ordiplot(nmds, type ="n")orditorp(nmds, display ="species", col ="red", air =0.01)orditorp(nmds, display ="sites", col = colors, air =0.01, cex =1.25)ordispider(nmds, grupos, display="sites")ordiplot(nmds, type ="n")orditorp(nmds, display ="species", col ="red", air =0.01)orditorp(nmds, display ="sites", col = colors, air =0.01, cex =1.25)ordicluster(nmds, cluster_uas, prune =0, display ="sites")par(mfrow=c(1,1))plot(nmds)ordiplot(nmds, type ="n")orditorp(nmds, display ="species", col ="red", air =0.01)orditorp(nmds, display ="sites", col = colors, air =0.01, cex =1.25)ordicluster(nmds, hclust(m_dist(m_trns, "bray")))#?ordiclusterplot(nmds)ordiplot(nmds, type ="n")# Plot convex hulls with colors baesd on treatmentfor(i inunique(grupos)) {ordihull(nmds$point[grep(i,grupos),],draw="polygon",groups=grupos[grupos==i],col=colors[grep(i,grupos)],label=F) }orditorp(nmds, display ="species", col="red", air =0.01)orditorp(nmds, display ="sites", col = colors, air =0.01, cex =1.25)m <-nrow(m_trns)surf1 <-seq(1, m)surf2 <-c(cluster_uas$height, 1)surf3 <- m_hab_part$m.widthknots <-nrow(m_trns)plot(nmds)ordisurf(nmds, surf1, main ="", col ="forestgreen", knots = knots)ordisurf(nmds, surf2, main ="", add =TRUE, col ="green", knots = knots)orditorp(nmds, display ="sites", col ="grey30", air =0.1, cex =1)orditorp(nmds, display ="species", col ="grey30", air =0.1, cex =1)legend ('topleft', col =c('forestgreen', 'green'), lty =1, legend =c('surf1', 'surf2'))
Spatial segregation in macroinvertebrate composition was confirmed by MRPP, which shows significant differences between sites (A = 0.14, p = 0.001), but not for sampling months (A=-0.01, p=0.815) or seasons (wet/dry) (A < 0.01, p = 0.413). Differences between reservoir and stream sites were also significant (A = 0.03, p = 0.004).
Paired multiple comparison analyses
#BENTOS2006 - MRPP----#####....----##ORGANIZANDO DADOS----dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o consolet_grps <-read.csv("t_grps.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_dens <-read.csv("m_dens.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_hab_part <-read.csv("m_hab_part.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_cont <-read.csv("m_cont.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_trab <- m_denslibrary(vegan)m_trns <-asin(sqrt(decostand(m_trab,method="total", MARGIN =2))) #see hellinger#m_trns <- sqrt(m_trab)m_trab <- (m_trns)t_grps#TRABALHANDO COM MULTIPLAS VARIAVEIS----# Run MRPP for multiple variablesfatores <-c("UA", "mes", "estacao", "area", "ambiente")set.seed(123)mrpp_results <-lapply(fatores, function(v) { grp <-as.factor(t_grps[[v]])mrpp(dat = m_trab,grouping = grp,distance ="bray" )})names(mrpp_results) <- fatoresmrpp_results# Print one resultmrpp_results$UA# Loop through summariessink(file ="mrpp.txt", split =TRUE)#, append = TRUE)for (v innames(mrpp_results)) {cat("\n============================\n")cat("MRPP for:", v, "\n")print(mrpp_results[[v]])}sink()# Extract key statistics into a tablemrpp_summary <-data.frame(Variable = fatores,Delta =sapply(mrpp_results, function(x) x$delta),Expected_Delta =sapply(mrpp_results, function(x) x$E.delta),A =sapply(mrpp_results, function(x) x$A),P_value =sapply(mrpp_results, function(x) x$Pvalue))sink("mrpp.txt", append =TRUE)mrpp_summarysink()##TRABALHANDO COM UMA UNICA VARIAVEL----# Run MRPP for a single variablegrp <-as.factor(t_grps$UA) #UA, mes, estacao, area, ambienteset.seed(123)mrpp <-mrpp(dat = m_trab, grouping = grp, distance ="bray")mrppsummary(mrpp)# Teste de hipótese para uma unica variavelstr(mrpp)p_value <- mrpp$Pvalueprint(sprintf("p-value: %.10f", p_value))# Conditional statement to check the p-value and print the appropriate messageif (p_value <0.05) {print(sprintf("Existe diferença significativa porquê o valor de p é %.10f, sendo menor que o nível de significância de 0.05.", p_value))} else {print(sprintf("Não existe diferença significativa porquê o valor de p é %.10f, sendo maior ou igual ao nível de significância de 0.05.", p_value))}##TESTES PAREADOS----library(vegan)# Todos os pares de níveis do fatorpairs <-combn(levels(as.factor(grp)), 2, simplify =FALSE)pairs# Aplicar mrpp para cada parres_list <-lapply(pairs, function(p) { sel <- grp %in% pmrpp(dat = m_trab[sel, ], grouping =droplevels(grp[sel]), distance ="bray")})# Dar nomes aos resultadosnames(res_list) <-sapply(pairs, function(p) paste(p, collapse ="_vs_"))# visualizarres_list# Resumosummary_tab <-data.frame(Comparison =names(res_list),A =sapply(res_list, function(x) x$A),Delta =sapply(res_list, function(x) x$delta),Pvalue =sapply(res_list, function(x) x$Pvalue))summary_tab###MATRIX-STYLE TABLE----library(dplyr)library(tidyr)library(knitr)library(kableExtra)# Extract unique group names from the comparisonsgroups1 <-sort(unique(unlist(strsplit(summary_tab$Comparison, "_vs_"))))# Initialize empty matrixpval_matrix <-matrix(NA, nrow =length(groups1), ncol =length(groups1),dimnames =list(groups1, groups1))# Fill matrix with Pvaluesfor (i in1:nrow(summary_tab)) { comp <-unlist(strsplit(summary_tab$Comparison[i], "_vs_")) g1 <- comp[1] g2 <- comp[2] p <- summary_tab$Pvalue[i] pval_matrix[g1, g2] <- p pval_matrix[g2, g1] <- p #make symmetric}# Format p-values: red if <0.05pval_colored <-ifelse(is.na(pval_matrix), "",ifelse(pval_matrix <0.05,paste0("<span style='color:red;'>", sprintf("%.3f", pval_matrix), "</span>"),sprintf("%.3f", pval_matrix)))# Convert to data.frame for kablepval_df <-as.data.frame(pval_colored)pval_df <- tibble::rownames_to_column(pval_df, "Group")# Render HTML tablematrixstyle <- pval_df %>%kable("html", escape =FALSE) %>%kable_styling(full_width =FALSE)matrixstylesave_kable(matrixstyle, "kb-mrpp_matrixstyle.html")##GRÁFICO DA MRPP----png("fig-mrpp.png", width =15, height =10, units ="in", res =300)def.par <-par(no.readonly =TRUE)layout(matrix(1:2,nr=1))plot(ord <-metaMDS(m_trab, distance="bray"), type="text", display="sites")with(t_grps, ordihull(ord, grp))with(mrpp, { fig.dist <-hist(boot.deltas, xlim=range(c(delta,boot.deltas)),main="Test of Differences Among Groups")abline(v=delta);text(delta, 2*mean(fig.dist$counts), adj =-0.5,expression(bold(delta)), cex=1.5 ) })par(def.par)dev.off()###AVALIANDO AS DISTÂNCIAS MÉDIAS----md <-with(t_grps, meandist(vegdist(m_trab), grp))mdsummary(md)par(mfrow=c(1,2))plot(md)plot(md, kind="histogram")par(mfrow=c(1,1))#PerMANOVA----#browseURL("https://riffomonas.org/code_club/2021-03-18-vegan")#browseURL("https://riffomonas.org/code_club/2021-03-22-reproducibility")library(tidyverse)library(ggtext)library(ggplot2)library(vegan)adonis2(m_trab ~ UA, data = t_grps, method ="bray")set.seed(123)perm <-999all_test <-adonis2(m_trab ~ UA, data = t_grps, method ="bray", permutations = perm)all_testadonis2(m_trab ~ area*ambiente, data = t_grps)adonis2(m_trab ~ area*ambiente, data = t_grps, by ="terms") #ou by=NULL remove as interaçõesall_p <- all_test$`Pr(>F)`[1]all_pcount(t_grps, UA)colnames(t_grps)# Pairwise comparisons#browseURL("https://doi.org/10.1007/s10641-018-0751-1")#Lanés, L.E.K., Reichard, M., de Moura, R.G. et al. 2018.#Environmental predictors for annual fish assemblages in subtropical#grasslands of South America: the role of landscape and habitat#characteristics. Environ Biol Fish 101, 963–977.library(dplyr)#library(combninat) # or base R combn()set.seed(123)# Get unique levels of UAgroups <-unique(t_grps$UA)groups# All pairwise combinationspairs <-combn(groups, 2, simplify =FALSE)pairs# Store resultspairwise_results <-lapply(pairs, function(g) {# Subset data sub_data <- t_grps %>%filter(UA %in% g) sub_dist <- m_trab[rownames(sub_data), ]# Run adonis2 test <-adonis2(sub_dist ~ UA, data = sub_data,method ="bray", permutations = perm)tibble(group1 = g[1],group2 = g[2],F_value = test$F[1],R2 = test$R2[1],p_value = test$`Pr(>F)`[1] )})# Bind all resultspairwise_results <-bind_rows(pairwise_results)pairwise_resultspairwise_p <-numeric()pairwise_p <- pairwise_results$p_valuepairwise_padj_p_values <-p.adjust(pairwise_p, method="bonferroni") #c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none")adj_p_valuespairwise_results$adj_p_values <- adj_p_valuespairwise_results# Get unique group namesgroups2 <-sort(unique(c(pairwise_results$group1, pairwise_results$group2)))# Initialize empty matrixpval_matrix <-matrix(NA, nrow =length(groups2), ncol =length(groups2),dimnames =list(groups2, groups2))# Fill in p-valuesfor (i in1:nrow(pairwise_results)) { g1 <- pairwise_results$group1[i] g2 <- pairwise_results$group2[i] p <- pairwise_results$p_value[i] pval_matrix[g1, g2] <- p pval_matrix[g2, g1] <- p #make symmetric}pval_matrixlibrary(knitr)library(kableExtra)# Make a copy of the matrixpval_colored <- pval_matrix# Replace numeric values with formatted HTML stringspval_colored[] <-ifelse(is.na(pval_matrix), "",ifelse(pval_matrix <0.05,paste0("<span style='color:red;'>", sprintf("%.3f", pval_matrix), "</span>"),sprintf("%.3f", pval_matrix)))# Convert to data frame with rownamespval_df <-as.data.frame(pval_colored)pval_df <- tibble::rownames_to_column(pval_df, "Group")# Render tablepval_df %>%kable("html", escape =FALSE) %>%kable_styling(full_width =FALSE)save_kable(matrixstyle, "kb-perma_matrixstyle.html")## AVALIANDO INTERAÇÕES-----adonis2(m_trab ~ area, data = t_grps, method ="bray")adonis2(m_trab ~ area*ambiente, data = t_grps,method ="bray") #+ no lugar de * tira os residuosadonis2(m_trab ~ area*ambiente, data = t_grps,method ="bray", by ="terms", strata =NULL)load("m_dist.RData")bd <-betadisper(m_dist, t_grps$are)bdanova(bd)permutest(bd)bd <-betadisper(m_dist, t_grps$env)bdanova(bd)permutest(bd)bd <-betadisper(m_dist, t_grps$UA)bdanova(bd)permutest(bd, pairwise =TRUE)library(betapart)bd.HSD <-TukeyHSD(bd)plot(bd.HSD)plot(bd)bdist <- bd$distancesbdistplot(bd, ellipse =TRUE, hull =FALSE) # 1 sd data ellipseplot(bd, ellipse =TRUE, hull =FALSE, conf =0.90) # 90% data ellipseboxplot(bd)
Indicator Species Analysis identified ten taxa with significant indicator values (p < 0.05), partially overlapping with the multilevel pattern results. Several taxa were confirmed as indicators of RE, these being Oligochaeta, Conchostraca, Physidae and Coenagrionidae, all showing high indicator values (IndVal = 0.82–0.98, p < 0.03). Corixidae remained the solely indicator of CI (IndVal = 0.86, p = 0.013). Further exploration using Multilevel Pattern Analysis incorporated additional as indicators of specific site combinations, which were Ceratopogonidae and Libellulidae, for CI and RE (IndVal = 0.87-0.94, p < 0.007), Thiaridae and Atyidae, both associated with EP and SE (IndVal = 0.76-0.99, p < 0.048) and Planorbidae, associated with CI, RE, and SE (IndVal = 0.91, p = 0.048) (Figure 8).
Indicator species analises
#BENTOS2006 - ISA----#####....----##ORGANIZANDO DADOS----dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o consolet_grps <-read.csv("t_grps.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_dens <-read.csv("m_dens.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_trab <- m_denst_grpsload("m_dist.Rdata")#ISA----#browseURL("https://cran.r-project.org/web/packages/indicspecies/vignettes/IndicatorSpeciesAnalysis.html")library(indicspecies)# Grupos a priori----grupos <- t_grps$UAgrupos#OUrep <-c(rep("Serido", 12), rep("Buique", 11))rep#OUlevs <-factor(c(rep(1,12), rep(2,11)), labels =c("Serido","Buique"))levs#OU Grupos a posteriorikm <-kmeans(m_trab, centers=3)gruposkm <- km$clustergruposkm##VALORES INDICATIVOS (IV)----# Multi-level Pattern Analysisset.seed(123)indval <-multipatt(m_trab, grupos,control =how(nperm=999))indvalsink("isa_indval.txt", split=TRUE)# Summary of the resultssummary(indval)# Examining the indicator value componentssummary(indval, indvalcomp=TRUE)# Inspecting the indicator species analysis results for all speciessummary(indval, alpha=1)###RESUMO GERAL----indval$signsubset(indval$sign, p.value <=0.05) #only p<0.05 speciessink()# Tabela ordenada pelo menor valor de Pindval$sign[order(indval$sign$p.value), ]library("gt")library("webshot2")indicator_val <-gt(indval$sign[order(indval$sign$p.value), ], rownames_to_stub=TRUE)indicator_valgtsave(indicator_val, "gt-indval_gttable.html")gtsave(indicator_val, "fig-indval_gttable.png")##SPECIES ECOLOGICAL PREFERENCES (phi)----set.seed(123)pa <-ifelse(m_trab>0,1,0)phi <-multipatt(pa, grupos, func ="r", control =how(nperm=999))phiphi <-multipatt(pa, grupos, func ="r.g", control =how(nperm=999)) summary(phi)round(head(phi$str),3)round(head(indval$str),3)# Subsetting for p<0.05sig_phi <-rownames(subset(phi$sign, p.value <=0.05))round(phi$str[sig_phi, , drop =FALSE], 3)sig_indval <-rownames(subset(indval$sign, p.value <=0.05))round(indval$str[sig_indval, , drop =FALSE], 3)# Tabela ordenada pelo menor valor de Pphi$sign[order(phi$sign$p.value), ]library("gt")library("webshot2")phi_val <-gt(phi$sign[order(phi$sign$p.value), ], rownames_to_stub=TRUE)phi_valgtsave(phi_val, "gt-phi_gttable.html")gtsave(phi_val, "fig-phi_gttable.png")###SPECIES OF INTEREST p<0.05----sig <-subset(phi$sign, p.value <=0.05) #only p<0.05 speciessink("isa_indval.txt", append=TRUE)sigsink()library(ggplot2)sig$species <-rownames(sig)ggplot(sig, aes(x =reorder(species, stat), y = stat)) +geom_col() +coord_flip() +labs(x ="Species",y ="Indicator value (r.g)",title ="Significant species–habitat associations" )assoc <- sig[, grep("^s\\.", names(sig))]assoc$species <-rownames(sig)assoc_long <- reshape2::melt(assoc, id.vars ="species")png("fig-sig_spp.png")ggplot(assoc_long, aes(variable, species, fill = value)) +geom_tile() +scale_fill_gradient(low ="white", high ="black") +labs(x ="Group", y ="Species")dev.off()#R basebarplot( sig$stat,names.arg =rownames(sig),horiz =TRUE,las =1,xlab ="Indicator value (r.g)")
Redundancy analysis (Figure 4) revealed a significant relationship between macroinvertebrate community composition and the site morphology variables, river length, maximum site depth, and site width (Permutation test, F~3,18 = 4.71, p = 0.001). The constrained model explained 43.99% of the total community variation (adjusted R2 = 0.35), with the remaining 56.01% attributable to unconstrained variation. The first RDA axis was highly significant (F~1,18 = 11.33, p = 0.001) and accounted for 80.2% of the constrained inertia, whereas the second and third axes were not significant (p > 0.20). Sequential permutation tests indicated that channel width (F = 8.59, p = 0.001) and river length (F = 3.70, p = 0.013) contributed significantly to explaining community structure, while maximum depth had a weaker, non-significant effect (p = 0.10). Marginal tests confirmed that width and river length remained significant predictors when evaluated independently (p = 0.001). Species scores along RDA1 indicated strong associations of Oligochaeta and larvae of Chironomidae (≈ 40%) with wider sites (indicative of artificial reservoirs), whereas Thiaridae showed a pronounced negative association with RDA1 (90%), corresponding to narrower and shorter river reaches.
Among the water quality variables, the model including only turbidity showed significant relationship with macroinvertebrate community composition (F1,20 = 3.37, p = 0.013). The constrained model explained 14.42% of the total variation in community composition (adjusted R2 = 0.10). The first RDA axis was significant (F1,20 = 3.37, p = 0.01) and accounted for 100% of the constrained inertia, reflecting the presence of a single explanatory variable. Both sequential and marginal permutation tests confirmed that turbidity independently explained a significant proportion of variation in community structure (p = 0.015 and p = 0.008, respectively). Species scores along RDA1 showed strong positive associations of Thiaridae (56%) with higher turbidity values, whereas Oligochaeta (37%) and Notonectidae (18%) were negatively associated with this axis.
Substrate composition revealed a weak overall relationship between macroinvertebrate community composition and the sediment variables. The final reduced model included only mud and sand (permutation test, F2,19 = 1.70, p = 0.089). The constrained model explained 15.21% of the total community variation (adjusted R2 = 0.06). The first RDA axis was marginally significant (F1,19 = 2.70, p = 0.097) and accounted for 79.2% of the constrained inertia, whereas the second axis was not significant (p = 0.60). Sequential permutation tests indicated that sand contributed significantly to explaining community structure (F = 2.69, p = 0.035), while mud did not show a significant sequential effect (p = 0.59). Nonetheless, marginal tests showed that both sand (F = 2.69, p = 0.038) and mud (F = 2.44, p = 0.043) independently explained significant portions of community variation when evaluated while controlling for the other variable. Species scores along first RDA axis indicated a positive association of Chironomidae larvae (50%) and Notonectidae(13%) with sandy substrates, whereas Thiaridae (29%) and Oligochaeta (16%) showed strong negative associations with this axis, corresponding to sites with higher mud content.
Relationship between macroinvertebrate community composition and habitat structure variables was significant, with the final reduced model including macrophytes, submerged vegetation, litter, roots, and algae (F5,16 = 2.89, p = 0.001). The constrained model explained 47.43% of the total variation (adjusted R2 = 0.31). The first two RDA axes were significant, with RDA1 (F1,16 = 8.03, p = 0.003) and RDA2 (F1,16 = 4.00, p = 0.041), together accounting for 83.3% of the constrained inertia. The sequential permutation tests indicated that macrophyte cover (F = 4.34, p = 0.003), leaf litter (F = 2.61, p = 0.038), roots (F = 2.58, p = 0.024), and algae (F = 3.12, p = 0.014) contributed significantly to explaining community structure. Marginal tests confirmed that macrophytes (p = 0.002), roots (p = 0.003), and algae (p = 0.019) remained significant predictors when evaluated independently, while submerged vegetation and leaf litter showed weaker, non-significant marginal effects (p > 0.06). Species scores along RDA1 indicated strong positive associations of larvae of Chironomidae (36%) and Oligochaeta (35%) with sites characterized by higher macrophyte and leaf litter cover, whereas Thiaridae showed a pronounced negative association with this axis (79%), corresponding to habitats with lower structural complexity. Along RDA2, larvae of Chironomidae was positively associated (49%) with increased root and algal presence, whereas Oligochaeta was negatively associated, reflecting an important association with aquatic macrophytes.
Redundancy Analysis
#BENTOS2006 - RDA----#####....----browseURL("https://r.qcbs.ca/workshops/")##ORGANIZANDO DADOS----dev.off() #apaga os graficos, se houver algumrm(list=ls(all=TRUE)) #limpa a memóriacat("\014") #limpa o consolet_grps <-read.csv("t_grps.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_dens <-read.csv("m_dens.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)library(vegan)library(tidyverse)m_m <-read.table("m_m.txt")m_q <-read.table("m_q.txt")m_s <-read.table("m_s.txt")m_h <-read.table("m_h.txt")##CLASSIFICAÇÃO 1: MATRIZ COMUNITÁRIA----###DENDROGRAMA E HEATMAP 1----# Dendrogramacluster_uas <-hclust(m_dist, method ="average")plot(cluster_uas, main ="Cluster Dendrogram - Bray-Curtis da Matriz Comunitária",hang =0.1) #testar com -.01rect.hclust(cluster_uas, k =3, h =NULL) #h = 0.8 fornece os grupos formados na altura has.matrix(m_dist)[1:6, 1:6]# Heatmaplibrary("gplots")heatdist <-as.matrix(m_dist)col <-rev(heat.colors(999)) #rev() reverte as cores do heatmapheatmap.2(x=(as.matrix(m_dist)), #objetos x objetosRowv =as.dendrogram(cluster_uas),Colv =as.dendrogram(cluster_uas),key = T, tracecol =NA, revC = T,col = heat.colors, #dissimilaridade = 1 - similaridadedensity.info ="none",xlab ="UA´s", ylab ="UA´s",main ="Comunidade",mar =c(6, 6) +0.2)cluster_spp <-hclust((vegdist(t(m_trns), method ="bray",diag =TRUE,upper =FALSE)), method ="average")plot(cluster_spp, main ="Dendrograma dos atributos")heatmap.2(t(as.matrix(m_trns)), #objetos x atributosColv =as.dendrogram(cluster_uas),Rowv =as.dendrogram(cluster_spp),key = T, tracecol =NA, revC = T,col = col,density.info ="none",xlab ="Unidades amostrais", ylab ="Espécies",mar =c(6, 6) +0.1) #adjust margin size# Histórico das fusõeslibrary(gt)merge <-as.data.frame(cluster_uas$merge)merge[nrow(merge)+1,] =c("0","0")height <-as.data.frame(round(cluster_uas$height, 2))height[nrow(height)+1,] =c("1.0")fusoes <-data.frame(Cluster_uas = merge, Height = height)colnames(fusoes) <-c("Cluster1", "Cluster2", "Height")UA <-rownames_to_column(as.data.frame(m_trns[, 0]))colnames(UA) <-c("UAs")No.UA <-1:nrow(fusoes)fusoes <-cbind(No.UA, UA, fusoes)fusoes$Histórico <-1:nrow(fusoes)#fusoesgt(fusoes)##CLASSIFICAÇÃO 2: MATRIZ AMBIENTAL----###DENDROGRAMA E HEATMAP 2----# Dendrogramam_hab_trns <-read.csv("m_hab_trns.csv",sep =";", dec =".",row.names =1,header =TRUE,na.strings =NA)m_dist_hab <-vegdist(m_hab_trns, method ="bray",diag =TRUE,upper =FALSE)cluster_uas <-hclust(m_dist_hab, method ="average")plot(cluster_uas, main ="Cluster Dendrogram - Bray-Curtis para a Matriz Ambiental",hang =0.1) #testar com -.01rect.hclust(cluster_uas, k =3, h =NULL) #h = 0.8 fornece os grupos formados na altura has.matrix(m_dist_hab)[1:6, 1:6]# Heatmaplibrary("gplots")heatdist <-as.matrix(m_dist_hab)col <-rev(heat.colors(999)) #rev() reverte as cores do heatmapheatmap.2(x=(as.matrix(m_dist_hab)), #objetos x objetosRowv =as.dendrogram(cluster_uas),Colv =as.dendrogram(cluster_uas),key = T, tracecol =NA, revC = T,col = heat.colors, #dissimilaridade = 1 - similaridadedensity.info ="none",xlab ="UA´s", ylab ="UA´s",main ="Dados ambientais",mar =c(6, 6) +0.2)cluster_spp <-hclust((vegdist(t(m_hab_trns), method ="bray",diag =TRUE,upper =FALSE)), method ="average")plot(cluster_spp, main ="Dendrograma dos atributos")heatmap.2(t(as.matrix(m_hab_trns)), #objetos x atributosColv =as.dendrogram(cluster_uas),Rowv =as.dendrogram(cluster_spp),key = T, tracecol =NA, revC = T,col = col,density.info ="none",xlab ="Unidades amostrais", ylab ="Espécies", main ="Dados ambientais",mar =c(6, 6) +0.1) #adjust margin size# Histórico das fusõeslibrary(gt)merge <-as.data.frame(cluster_uas$merge)merge[nrow(merge)+1,] =c("0","0")height <-as.data.frame(round(cluster_uas$height, 2))height[nrow(height)+1,] =c("1.0")fusoes <-data.frame(Cluster_uas = merge, Height = height)colnames(fusoes) <-c("Cluster1", "Cluster2", "Height")UA <-rownames_to_column(as.data.frame(m_hab_trns[, 0]))colnames(UA) <-c("UAs")No.UA <-1:nrow(fusoes)fusoes <-cbind(No.UA, UA, fusoes)fusoes$Histórico <-1:nrow(fusoes)#fusoesgt(fusoes)##CLASSIFICAÇÃO 3: Heatmap Matriz comunitária *vs.* Matriz ambiental----###DENDROGRAMA E HEATMAP 3----# Dendrogramalibrary(vegan)library(gplots)# Dendrograma para as UA's da matriz comunitáriacluster_uas_com <-hclust(m_dist, method ="average")plot(cluster_uas_com, main ="Cluster Dendrogram - Bray-Curtis da Matriz Comunitária",hang =0.1) #testar com -.01rect.hclust(cluster_uas_com, k =3, h =NULL) #h = 0.8 fornece os grupos formados na altura has.matrix(m_dist)[1:6, 1:6]# Dendrograma para as variáveis particionadas da matriz ambientalm_trns_hab <-sqrt(m_hab_part)cluster_mpart_hab <-hclust((vegdist(t(m_trns_hab),method ="bray",diag =TRUE,upper =FALSE)), method ="average")plot(cluster_mpart_hab, main ="Dendrograma dos atributos")# Heatmap Comunidade vs. Morfologiaheatmap.2(t(as.matrix(m_trns_hab)), #objetos x atributosColv =as.dendrogram(cluster_uas_com),Rowv =as.dendrogram(cluster_mpart_hab),key = T, tracecol =NA, revC = T,col = col,density.info ="none",xlab ="Unidades amostrais", ylab ="Varíáveis ambientais",mar =c(6, 9) +0.1) #adjust margin size# Explorando correlações multivariadas da matriz de habitat#plot(m_trns_a$"m.elev", m_trns_a$"m.river")#plot(scale(m_trns_a$"m.elev"), scale(m_trns_a$"m.river"))#plot((m_trns_a$"m.elev" - mean(m_trns_a$"m.elev")) / sd(m_trns_a$"m.elev"))library(psych)pairs.panels(m_trns_hab[,], method ="pearson", # correlation methodscale =FALSE, lm =FALSE,hist.col ="#00AFBB", pch =19,density =TRUE, # show density plotsellipses =TRUE, # show correlation ellipsesalpha =0.5 )#ANÁLISE DE REDUNDÂNCIA----##TRANFORMAÇÃO DA MATRIZ DE COMUNIDADE COM `hellinger`----com_hell <-decostand(m_dens, method ="hellinger")m_trns_hab <- m_m### FAZENDO A RDA----rda <-rda(com_hell ~ ., m_trns_hab) #variavel de resposta ~ variavel explanatóriasummary(rda)#str(rda)modelo_rda <- rda$callmodelo_rda# Testando a significância dos resultados da RDARsquareAdj(rda)anova.cca(rda) #significância globalanova.cca(rda, by="axis") #sign. dos eixos da RDAanova.cca(rda, by="terms") #sign. das var. explanatórias### GRÁFICOS DE RDA----# Gráfico triplot basicoplot(rda, choices =c(1,2))plot(rda, display =c("sp","bp"))# Ajustes no gráfico triplotmodel <-ordiplot(rda, type ="none", scaling =2, xlab ="RDA1", ylab ="RDA2") #type="points" ou "text"##UAspoints(rda, col ="black", pch =21, bg ="gray", cex =1)text(rda, col ="black", cex =0.7, pos =1)##vetorespoints(rda, dis ="bp", col ="red", pch =21, cex =1)text(rda, dis ="bp", col ="red", cex =0.9, pos =2)##espéciespoints(rda, dis ="sp", col ="blue", pch =3, cex =1)text(rda, dis ="sp", col ="blue", cex =0.8, pos =1)# Show only the first [min:max] species with higher valueshigh_species <-order(abs(scores(rda, display ="species")[, 1]), decreasing =TRUE)[1:9]high_species# Get the indices of the first 5 species with higher values on the first axispoints(rda, dis ="sp", select = high_species, col ="darkblue", pch =3, cex =1)text(rda, display ="species", select = high_species, col ="darkblue", cex =0.8, pos =4) #display the selected species### AVALIANDO COLINEARIDADE#(Step1-Remove the worst offender first, Step 2—Recheck VIF, Step 3—Final clean model)# Função variance inflation factorsvif.cca(rda) #remover valores com colinearidade maior que 20rda#rda <- rda(formula = com_hell ~ river_length + depth_hab + depth_max + slope + width, data = m_trns_hab)### ESCOLHENDO AS PRINCIPAIS VARIÁVEIS DO MODELO DA RDA#It is important to note that selecting variables ecologically is much more important than performing selection in this way. If a variable of ecological interest is not selected, this does not mean it has to be removed from the RDA ("https://r.qcbs.ca/workshop10/book-en/redundancy-analysis.html").# GLMrda1 <-step(rda, scope =formula(rda), scale =0, direction ="both", steps =1000, test ="perm") #step refits a new RDA with variables added or removed, based on permutations and AICsummary(rda1)modelo_rda1 <- rda1$call #variáveis sugeridas pelo modelomodelo_rda1#(matriz de dados comunitária ~ variáveis da matriz ambiental)rda2 <-rda(formula(rda1), data = m_trns_hab)#(variáveis de resposta ~ variaveis explanatórias sugeridas pelo modelo stepwise regression)#rda2 <- rda(formula = hell ~ turb, data = m_trns_hab)rda2summary(rda2)#str(rda2)# Colinearidade do novo modelovif.cca(rda2)sink("anovas_rda2_s.txt")# ANOVASRsquareAdj(rda2) #total variance explained and it´s significcanceanova.cca(rda2, perm.max =999)#overall model, do these variables together explain community composition better than random?anova.cca(rda2, by="axis", perm.max =999) #axes, are the canonical axes significant?anova.cca(rda2, by="terms", perm.max =999) #sequential (Type I) effects, does a given variable explain additional variance beyond the one entered before it in the model?anova.cca(rda2, by="margin", perm.max =999) #marginal (Type III) effects, does each variable explain variance after controlling for the other? *REPORT THISsummary(rda2)sink()### INDIVIDUAL FINAL SIMPLIFIED TRIPLOT----# Getting axes explained varianceseig <- rda2$CCA$eigvar_exp <- eig /sum(rda2$tot.chi) *100var_expRDA1_perc <-round(var_exp[1], 1)RDA2_perc <-round(var_exp[2], 1)png("fig-rda_h.png", units ="in", width =8, height =8, res =500)simplified_model <-ordiplot(rda2, type ="n", scaling =2, #ylim = c(-1, 1),xlab =paste0("RDA1 (", RDA1_perc, "%)"),ylab =paste0("RDA2 (", RDA2_perc, "%)"))##UAsUAs <- t_grps$UApoints(rda2, col ="black", pch =21, bg ="gray", cex =1)text(rda2, labels = UAs, col ="black", cex =0.8, pos =2)##vetorespoints(rda2, dis ="bp", col ="red", pch =21, cex =1)text(rda2, dis ="bp", col ="red", cex =0.9, pos =1)##espéciespoints(rda2, dis ="sp", col ="red", pch =3, cex =0.5)#text(rda2, dis = "sp", col = "red", pos = 1)##espécies principais# Show only the first [min:max] species with higher valueshigh_species <-order(abs(scores(rda2, display ="species")[, 1]), decreasing =TRUE)[1:9]high_species# Get the indices of the first 5 species with higher values on the first axis#points(rda2, dis = "sp", select = high_species, col = "darkblue", pch = 3, cex = 1)orditorp(rda2, dis ="sp", select = high_species, col ="darkblue", cex =0.8, pos =4, air =0.5) #display the selected speciesdev.off()##COMBINED FINAL PANEL----library(png)library(grid)library(gridExtra)imgs <-c("fig-rda_m.png","fig-rda_q.png","fig-rda_s.png","fig-rda_h.png")# Read images and convert to grobsimg_grobs <-lapply(imgs, function(x) {rasterGrob(readPNG(x), interpolate =TRUE)})labels <-c("A", "B", "C", "D")img_grobs_labeled <-mapply(function(img, lab) {arrangeGrob( img,top =textGrob(lab, x =unit(0.15, "npc"), y =unit(-0.8, "npc"), #posição das labelsjust =c("left", "top"),gp =gpar(fontsize =14, fontface ="bold")) ) }, img_grobs, labels,SIMPLIFY =FALSE)png("fig-rda_panel.png", units ="in", width =10, height =10, res =500)grid.arrange(grobs = img_grobs_labeled,ncol =2,top =textGrob("", gp =gpar(fontsize =16, fontface ="bold")))dev.off()### CROPPING IMAGENS----library(png)# Read the imageimg <-readPNG("fig-rda_h.png")plot(as.raster(img))# Image dimensionsh <-dim(img)[1]w <-dim(img)[2]# Crop size in pixels (1 inch at 500 dpi)crop_px <-500# Cropimg_cropped <- img[ (crop_px +1):(h - crop_px), (crop_px +1):(w - crop_px), , drop =FALSE]plot(as.raster(img_cropped))# Save cropped imagewritePNG(img_cropped, "cropped_rda_h.png")# Cropping batchfiles <-c("fig-rda_m.png", "fig-rda_q.png","fig-rda_s.png", "fig-rda_h.png")for (f in files) { img <-readPNG(f) h <-dim(img)[1] w <-dim(img)[2] crop_px <-500 img_cropped <- img[ (crop_px +1):(h - crop_px), (crop_px +1):(w - crop_px), , drop =FALSE ]writePNG(img_cropped, paste0("cropped_", f))}
4 Discussion
Species richness and community composition of fish observed in the present study are in accordance with other studies performed in the semiarid region of Brazil (Medeiros and Maltchik 2001a, Medeiros et al. 2010) with the Characidae family being the most representative in terms of richness and abundance. Added to the fact that the order Characiformes includes a large number of fish species from the Brazilian semiarid region (Rosa et al. 2003), most species of Characidae observed in the present study are small-sized with relatively high fecundity and growth rates (Medeiros and Maltchik 2000, 2001b) and generalist in feeding habits and in the use of habitat (Silva et al. 2010, Silva 2012).Compared to other dryland river systems, the study sites showed well oxygenated, transparent and relatively warm waters throughout the study period (see, for instance, Medeiros and Arthington 2011). River reach morphology varied mostly in accordance with the nature of the study site, flowing stream, isolated pool or reservoir, and the aquatic habitat was diverse with a range of habitat elements available for colonization by the aquatic biota. Flowing stream and pool sites showed a similar to greater array of marginal habitat elements and substrate composition when compared to the reservoirs. Studies in semiarid Brazil have shown that the aquatic habitat tends to be richer in streams, when compared to reservoirs and advocate that this richness in underwater structures and the more variable nature of intermittent streams enhance fish diversity (Medeiros et al. 2006, Medeiros et al. 2008).Multiple regression showed that the river reach morphology and habitat structure had the most variables explaining fish richness. The fact that site morphology is an important factor explaining fish richness (mostly width, length and elevation) indicates that these river reach factors are associated with different ranges of environmental variables that support a number of fish species. This is backed up by the multivariate analyses which showed strong spatial segregation of the fish community into spatial assemblage patterns. It has been shown that different assemblages of fish species exhibit preferences for specific habitat types (Martin-Smith 1998). In the present study, the absence of water flow on most sites and the lack of longitudinal connectivity led to specific environmental conditions thus, segregating the fish community to local patterns of species composition. Therefore, in the absence of water flow, the fish community would be expected to assume different compositions across habitat types (namely flowing stream reach, pool, and reservoir) in accordance with the environmental heterogeneity. This is further supported by the significantly different richness across sites, mostly with regard to the Seridó (SE) and Escama-Peixe (EP) stream sites which showed higher average richness.Abundance results were less conclusive, since ANOVA showed no significant difference across sites and the multiple regression reported only dissolved oxygen and overhanging vegetation as important predictors of fish abundance. Natural abundances of fish in Brazilian semiarid aquatic systems vary greatly, both spatially and temporally (Medeiros et al. 2010). This is likely associated with the movement of individuals within small stretches of river and between shallower and deeper areas in larger pools or reservoirs during flooding (Medeiros and Arthington 2008), as well as with patterns of fish recruitment and spawning (Medeiros and Maltchik 2000).The importance of the riparian vegetation and its influence on the aquatic habitat is well recognized (Gregory et al. 1991), as it contributes with shade and underwater structures for fish refuge, as well as spawning sites (Pusey and Arthington 2003). In the present study, variables associated with the presence of riparian vegetation were important predictors of fish richness (overhanging vegetation) and associated with assemblage composition (woody debris and water temperature). These variables are likely contributing with environmental conditions adequate to fish as well as creating local packages of environmental conditions that contribute with the observed spatial segregation in fish community. The presence of aquatic macrophytes in reservoir and stream sites and their importance as predictors for richness and species composition is also an indication of the role played by underwater habitat structures in creating local habitats spatially segregated available for fish use (Xie et al. 2001).The habitat can be viewed as a framework where environmental variables affect biological communities (Southwood 1977). Nonetheless, in hydrologically variable systems, flow variability has been suggested as a major axis influencing on the habitat template (Minshall 1988, Poff and Ward 1989). Dryland aquatic systems in Brazil are highly variable with regard to the presence and magnitude of water flow (Maltchik and Florin 2002) and, even though results indicate that the structure of the habitat and site morphology are important predictors of fish fauna, the habitat template is multidimensional (Southwood 1977) and the various environmental variables measured interact among themselves. In semiarid systems of Brazil, the habitat structure is being driven by many components at different scales (Medeiros et al. 2008), where the role of catchment characteristics and local morphology would increase in predominance at their respective scales. This highlights the importance of the link between habitat structure and the biotic diversity at the local habitat level.Alterations in natural water flow patterns based on reservoir and weir construction, resulting from current management policies for aquatic semi-arid systems in Brazil, modify the hydrological characteristics of the highly variable intermittent streams (Leal et al. 2005, Maltchik and Medeiros 2006). With the changes in the natural dynamics of water flow come also changes in macrophyte assemblage structure, nutrient dynamics and longitudinal connectivity, all of which are associated with the conversion of lotic to lentic systems (Bunn and Arthington 2002). Data from the present study show that these modifications will affect fish communities (richness and composition) since variables highly associated with water flow such as stream reach width, macrophyte cover, overhanging vegetation and dissolved oxygen concentration were important predictors of fish assemblage. Therefore, the modification of natural patterns of water flow and promotion of lentic conditions has the potential to interfere with the fish fauna by favoring more opportunistic species better adapted to no flow conditions. This is corroborated by the fact the reservoir sites had fewer indicator species which were mostly introduced ones, such as Poecilia reticulata and Parachromis managuensis, or typically lentic ones such as Geophagus brasiliensis.This study showed that fish communities assume different structures and compositions across different habitat types in accordance with the environmental heterogeneity associated with the nature of the habitat (reservoirs, flowing streams and pools). Richness and composition of fish were influenced mostly by two sets of environmental variables, being site morphology and habitat characteristics. The first set of variables relates with the catchment spatial hierarchical levels, since river length and elevation are typically associated with the river hierarchy. The second set of variables is typically local with underwater structures being the result of local processes such as river reach water flow and land use/riparian vegetation.
Spatial segregation in macroinvertebrate composition was confirmed by MRPP, which shows significant differences between sites (A = 0.14, p = 0.001), but not for sampling months (A = -0.01, p = 0.815) or seasons (wet/dry) (A < 0.01, p = 0.413). Differences between reservoir and stream sites were also significant (A = 0.03, p = 0.004). CITE paper about time from (Rocha et al. 2012) and explain that variations in time in this study were too wide to pick up variations in macroinvertebrate composition. Also mention that our results are expected given that the sampling design span different basins and regions in the northeast of Brazil.
Overall, both approaches consistently identified RE as the habitat with the highest number of indicator taxa, while CI also supported distinct specialist species. Multilevel pattern analysis emphasized taxa with shared habitat affinities, whereas IndVal highlighted taxa with high fidelity to specific habitat combinations.
Site scores displayed a gradual separation along RDA1 consistent with a gradient from sand-dominated to mud-dominated substrates.
Acknowledgements
The authors are grateful to Virginia Diniz (UFPB) for fieldwork assistance. This research was supported by funds from UEPB/FAPESQ (68.0006/2006.0) and Projeto de Pesquisa em Biodiversidade do Semiárido (PPBio SemiÁrido). Elvio Medeiros is grateful to CNPq/UEPB/DCR for scholarship granted (350082/2006-5).
References
Barbosa, J. E. de L., J. dos S. Severiano, S. Y. L. Costa, B. de F. Terra, E. S. F. Medeiros, and R. F. Menezes. 2025. Rivers of the northeast. Pages 437–465 in M. A. S. Graça, M. Callisto, F. T. de Mello, and D. Rodríguez-Olarte, editors. Rivers of south america. Book Section, Elsevier.
Maitland, P. S. 1990. Field studies: Sampling in freshwaters. Pages 123–148 Biology of fresh waters. Second edition. Book Section, Springer, Dordrecht, The Netherlands.
McCafferty, W. P., and A. V. Provonsha. 1983. Aquatic entomology: The fisherman’s and ecologist’s illustrated guide to insects and their relatives. Page 448. Book, Jones; Bartlett Publishers, Inc.
McCune, B., and J. B. Grace. 2002. Analysis of ecological communities. Page 300. Book, MjM Software Design, Gleneden Beach, Oregon, U.S.A.
Merritt, R. W., K. W. Cummins, and M. B. Berg. 2019. An introduction to the aquatic insects of north america. Page 1480. Fifth edition. Book, Kendall Hunt Publishing Company, Dubuque, IA.
Mugnai, R., J. L. Nessimian, and D. F. Baptista. 2010. Manual de identificação de macroinvertebrados aquáticos do estado do rio de janeiro. Page 176. Book, Technical Books Editora, Rio de Janeiro, RJ.
Oksanen, J., F. G. Blanchet, M. Friendly, R. Kindt, P. Legendre, D. McGlinn, P. R. Minchin, R. B. O’Hara, G. L. Simpson, P. Solymos, M. H. H. Stevens, E. Szoecs, and H. Wagner. 2020. Vegan: Community ecology package. R package version 2.5-7. Report, The Comprehensive R Archive Network Archive.
Pusey, B., M. J. Kennard, and A. Arthington. 2004. Study area, data collection, analysis and presentation. Pages 26–48 in B. Pusey, M. J. Kennard, and A. Arthington, editors. Freshwater fishes of north-eastern australia. Book Section, CSIRO Publishing, Melbourne, Australia. 682 pp.
Ross, S. T., Jr. McMichael Robert H., and D. L. Ruple. 1987. Seasonal and diel variation in the standing crop of fishes and macroinvertebrates from gulf of mexico surf zone. Estuarine, Coastal and Shelf Science 25:391–412.
Sokal, R. R., and F. J. Rohlf. 1995. Biometry: The principles and practice of statistics in biological research. Page 776. Third edition. Book, W.H. Freeman; Company, New York.
Zar, J. H. 1999. Biostatistical analysis. Page 663. 4th edition. Book, Prentice Hall, Upper Saddle River, New Jersey, USA.
Figures and Tables
Figure 1. Study sites across the Seridó/Borborema (RN/PB) and Buíque/Ipojuca (PE), areas in northeastern Brazil. MU, Mulungú reservoir, Salobro reservoir, EP, Escama-peixe stream, RE, Recanto reservoir, CI, Cipó stream, SE, Seridó river. (Seridó river, Site 2: Cipó river, Site 3: Recanto reservoir, Site 4: Escama-Peixe stream, Site 5: Mulungú reservoir and Site 6: Gurjão reservoir.)
Figure 2. Historical averages (2009-2020) of precipitation in the study area (SUDENE 2025).
Figure 1: Principal component analysis of the environmental variables measured during the 2006 hydrological cycle for each group of variables.
Figure 3:NMDS Bi-dimensional solution (stress= 14.09) to macroinvertebrate composition in the studied áreas (Seridó and Buíque), Brazilian semi-arid. Vectors indicate direction and strength of correlation between macroinvertebrate taxa and NMDS axes (r2 > 0.3). Codes indicate sites (S1-S6) and sampling occasions (1-4).
Figure 4. Triplot of the final RDA model showing sites, species, and significant environmental variables. Arrows indicate the direction of increasing environmental gradients. Biplot CCA graphic showing benthic macroinvertebrate composition and predictive environmental variables as defined by analysis. Codes indicate sites (S1-S6) and sampling occasions (1-4).
Table 1: Environmental variables measured during the 2006 hydrological cycle averaged (min-max) per site.
Figure 10: Abundance, percentage and frequency of occurrence (F.O.) of taxa. Seridó stream (SE), Cipó stream (CI) and Recanto reservoir (RE), Escama-Peixe stream (EP), Mulungu reservoir (MU) and Salobro reservoir (SA).
Table I. Density (ind/m2) and frequency of occurence (FO%) of benthic macroinvertebrates collected during 2006 hydrological cycle in Brazilian semi-arid.
Table II. Summary of CCA axes of environmental variables and benthic macroinvertebrates collected from aquatic ecosystems in Brazilian semi-arid during 2006 hydrological cycle.
Figures
Fig.: Environment data PCA - fviz
Figure 1: Environment data Principal Component Analysis using the fviz function.
Fig.: Variation in rarefied richness
Figure 2: Variation in rarefied richness across study sites.
Figure 7: Environment data Principal Component Analysis using the prcomp and plot functions.
Fig.: Multilevel Pattern Analysis of indicator species
Figure 8: Multilevel Pattern Analysis of indicator species.
Tab.: Environment data GT table
Figure 9: Environment data GT table
Tab.: Density data table
Figure 10: Density of species data table.
Tab.: Diversity descriptors data table
Figure 11: Diversity descriptors of the community structure data table.
Non-used codes
Removendo espécies raras
#REMOVENDO ESPÉCIES RARAS----browseURL("https://uw.pressbooks.pub/appliedmultivariatestatistics/chapter/data-adjustments/")library("labdsv")colnames(m_part)m_dom <-dropspc(m_part, minabu=5)m_domcolnames(m_dom)m_5 <-vegtab(comm = m_part, minval = (0.50*nrow(m_part))) #minval=5% of the number of samplest(m_5)colnames(m_5)m_pa <- m_part; m_pa[m_pa !=0] <-1#SEPARANDO B e Spat <-"^B"m_trab <- m_trab[!grepl(pat, rownames(m_trab)), ] #exclui quem começa com pat#REMOVENDO ESPÉCIES RARASbrowseURL("https://uw.pressbooks.pub/appliedmultivariatestatistics/chapter/data-adjustments/")install.packages("labdsv")library("labdsv")colnames(m_part)m_dom <-dropspc(m_part, minabu=5)m_domcolnames(m_dom)m_5 <-vegtab(comm = m_part, minval = (0.50*nrow(m_part))) #minval=5% of the number of samplest(m_5)colnames(m_5)m_pa <- m_part; m_pa[m_pa !=0] <-1